RFM Analysis

RFM is a method used for analyzing customer value. It is commonly used in database marketing and direct marketing and has received particular attention in retail and professional services industries.

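RFM stands for the three dimensions: recency (how recently a customer purchased), frequency (how often they purchase), and monetary value (how much they spend).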

Customer purchases may be represented by a table with columns for the customer name, date of purchase and purchase value. One approach to RFM is to assign a score for each dimension on a scale from 1 to 10. The maximum score represents the preferred behavior and a formula could be used to calculate the three scores for each customer. For example, a service-based business could use these calculations:
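As an illustration only, one way to turn raw values into 1 to 10 scores is sketched below. The formulas, the function names, and the benchmark of 100 currency units are assumptions made for this example, not calculations taken from a particular business.

```python
# Hypothetical scoring rules on a 1-10 scale; 10 is the preferred behavior.

def recency_score(months_since_last_purchase: int) -> int:
    # Lose one point per month since the last purchase, never below 1.
    return max(1, 10 - months_since_last_purchase)

def frequency_score(purchases_last_12_months: int) -> int:
    # One point per purchase in the last 12 months, clamped to 1..10.
    return min(10, max(1, purchases_last_12_months))

def monetary_score(largest_purchase: float, benchmark: float = 100.0) -> int:
    # Largest purchase expressed in whole multiples of an assumed benchmark value.
    return min(10, max(1, int(largest_purchase // benchmark)))

scores = (recency_score(3), frequency_score(15), monetary_score(450.0))
```

A customer who last bought three months ago, made 15 purchases in the past year, and whose largest purchase was 450 would receive the scores held in `scores`.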

Alternatively, categories can be defined for each attribute. For instance, Recency might be broken into three categories: customers with purchases within the last 90 days; between 91 and 365 days; and longer than 365 days. Such categories may be derived from business rules or using data mining techniques to find meaningful breaks.
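The three Recency categories described above can be sketched with `pandas.cut`. The sample values and the band labels are made up for illustration; only the 90-day and 365-day breaks come from the text.

```python
import numpy as np
import pandas as pd

# Toy recency values in days (illustrative, not taken from the dataset).
recency_days = pd.Series([6, 90, 91, 365, 366, 739], name="recency_days")

# Right-closed bins: (-inf, 90], (90, 365], (365, inf).
recency_band = pd.cut(
    recency_days,
    bins=[-np.inf, 90, 365, np.inf],
    labels=["0-90 days", "91-365 days", "over 365 days"],
)
```

Breaks derived from data mining rather than business rules would simply replace the values passed to `bins`.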

Once each of the attributes has appropriate categories defined, segments are created from the intersection of the values. If there were three categories for each attribute, the resulting matrix would have twenty-seven possible combinations (one well-known commercial approach uses five bins per attribute, which yields 125 segments). Companies may also decide to collapse certain subsegments if the gradations appear too small to be useful. The resulting segments can be ordered from most valuable (highest recency, frequency, and value scores) to least valuable (lowest recency, frequency, and value scores).

Identifying the most valuable RFM segments can capitalize on chance relationships in the data used for the analysis. For this reason, it is highly recommended that another set of data be used to validate the results of the RFM segmentation process.

Advocates of this technique point out that it has the virtue of simplicity: no specialized statistical software is required, and the results are readily understood by business people. In the absence of other targeting techniques, it can provide a lift in response rates for promotions.
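The segment counts and the ordering described above can be checked with a small sketch. The column names `R`, `F`, `M`, the customer labels, and the convention that a higher score is better are assumptions for this example.

```python
from itertools import product

import pandas as pd

# Every combination of three levels across three attributes.
levels = [1, 2, 3]
all_segments = ["".join(map(str, combo)) for combo in product(levels, repeat=3)]

# Toy per-customer scores (3 = best, 1 = worst); values are illustrative.
scores = pd.DataFrame(
    {"R": [3, 1, 2], "F": [3, 1, 3], "M": [3, 2, 1]},
    index=["a", "b", "c"],
)

# A segment label is the intersection of the three scores, e.g. "333".
scores["segment"] = (
    scores["R"].astype(str) + scores["F"].astype(str) + scores["M"].astype(str)
)

# Order customers from most to least valuable, comparing R, then F, then M.
ranked = scores.sort_values(["R", "F", "M"], ascending=False)
```

Sorting on R first is a design choice; a business that values spend over recency could reorder the sort keys.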

Details about the dataset

An e-commerce company wants to segment its customers and determine marketing strategies according to these segments. To this end, we will define the behavior of customers and create groups according to clusters in these behaviors. In other words, we will include those who exhibit common behaviors in the same groups and we will try to develop special sales and marketing techniques for these groups.

Exploratory Data Analysis

In terms of Description:

In terms of Invoice:

Missing Value Analysis

The Customer ID column is the only customer identifier, so rows where it is NaN cannot be attributed to any customer. We have no option other than removing them, even though they make up roughly 22.5 to 23% of the data.

For Description, the missing records make up only 0.4% of the data, so they can safely be removed as well.
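A minimal sketch of this step on toy data follows. The column names `Invoice`, `Description`, and `Customer ID` are taken from the text; the rows themselves are invented, so the percentages differ from those quoted above.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real transactions (values are made up).
df = pd.DataFrame({
    "Invoice": ["489434", "489435", "489436", "489437"],
    "Description": ["WHITE MUG", None, "RED BAG", "BLUE BOX"],
    "Customer ID": [13085.0, 13085.0, np.nan, 14000.0],
})

# Percentage of missing values per column.
missing_share = df.isna().mean() * 100

# Drop rows that lack a customer identifier or a description.
cleaned = df.dropna(subset=["Customer ID", "Description"])
```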

Outlier Value Analysis

Recency - R

The lowest recency value is 6, which means the most recent purchase was made 6 days before the latest date, 10/12/2011.
The maximum recency is 739 days, which corresponds to a purchase on 1/12/2009.
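Recency can be computed as the number of days between a reference date and each customer's last purchase. The sketch below assumes the reference date 10/12/2011 is read day-first (10 December 2011) and that the timestamp column is called `InvoiceDate`; both are assumptions, and the rows are toy data.

```python
import pandas as pd

# Toy transactions (illustrative values).
tx = pd.DataFrame({
    "Customer ID": [1, 1, 2],
    "InvoiceDate": pd.to_datetime(["2009-12-01", "2011-12-04", "2009-12-01"]),
})

# Assumed reference date: 10 December 2011.
reference_date = pd.Timestamp("2011-12-10")

# Days elapsed since each customer's most recent purchase.
last_purchase = tx.groupby("Customer ID")["InvoiceDate"].max()
recency = (reference_date - last_purchase).dt.days
```

With these toy rows, customer 1 last bought 6 days before the reference date and customer 2 bought only on 1 December 2009, 739 days before it.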

Frequency - F

As a sanity check:

Maximum frequency: 391
Minimum frequency: 1
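One common definition of frequency, assumed here because the text does not spell it out, is the number of distinct invoices per customer. A sketch on toy data:

```python
import pandas as pd

# Toy transactions: one row per invoice line (values are made up).
tx = pd.DataFrame({
    "Customer ID": [1, 1, 1, 2],
    "Invoice": ["A1", "A1", "A2", "B1"],
})

# Count distinct invoices so that multi-line invoices are not double counted.
frequency = tx.groupby("Customer ID")["Invoice"].nunique()

max_frequency = frequency.max()
min_frequency = frequency.min()
```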

Monetary - M

There are no rows with a negative amount.

Maximum monetary amount spent: 597336.11
Minimum monetary amount spent: 0.0
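A sketch of the monetary calculation follows, assuming the amount of a line is `Quantity * Price` (column names assumed) and using toy rows. Rounding the totals to two decimals keeps floating-point tails out of the printed values.

```python
import pandas as pd

# Toy transactions (illustrative values).
tx = pd.DataFrame({
    "Customer ID": [1, 1, 2],
    "Quantity": [2, 1, 3],
    "Price": [2.5, 10.0, 0.0],
})

# Line amount, then a check for negative amounts.
tx["Amount"] = tx["Quantity"] * tx["Price"]
negative_rows = tx[tx["Amount"] < 0]

# Total spend per customer, rounded to cents.
monetary = tx.groupby("Customer ID")["Amount"].sum().round(2)
```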

Note that we will compute an average amount, defined as Monetary divided by the number of unique months in which the customer was active.

We do this because a customer who has been active for a longer period has a better chance of accumulating a high Monetary value.
Dividing the amount by the number of unique active months gives the amount spent per month of activity, which makes customers with different tenures comparable.
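The average described above can be sketched as follows. The column names `InvoiceDate` and `Amount`, the derived names, and the rows are assumptions for illustration.

```python
import pandas as pd

# Toy transactions (illustrative values).
tx = pd.DataFrame({
    "Customer ID": [1, 1, 1, 2],
    "InvoiceDate": pd.to_datetime(
        ["2010-01-05", "2010-01-20", "2010-03-02", "2011-06-15"]
    ),
    "Amount": [100.0, 50.0, 150.0, 120.0],
})

# Calendar month of each purchase, e.g. 2010-01.
tx["YearMonth"] = tx["InvoiceDate"].dt.to_period("M")

# Total spend and number of distinct active months per customer.
per_customer = tx.groupby("Customer ID").agg(
    monetary=("Amount", "sum"),
    active_months=("YearMonth", "nunique"),
)

# Average spend per active month.
per_customer["avg_monthly_amount"] = (
    per_customer["monetary"] / per_customer["active_months"]
)
```

Customer 1 spent 300 over two active months and customer 2 spent 120 in a single month, so their averages are 150 and 120 respectively.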

Exclusions

We can conclude that these products were given free of cost (FOC).
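If free-of-cost items are identified by a zero unit price, which is an assumption here since the text does not show the filter used, they could be separated from paid items as sketched below on toy rows. The column names `StockCode`, `Quantity`, and `Price` are likewise assumed.

```python
import pandas as pd

# Toy transactions (illustrative values).
tx = pd.DataFrame({
    "StockCode": ["S1", "S2", "S3"],
    "Quantity": [1, 5, 2],
    "Price": [0.0, 1.25, 0.0],
})

# Rows with a zero price are treated as free-of-cost items.
free_items = tx[tx["Price"] == 0]

# Remaining rows, to be kept for the monetary analysis.
paid_items = tx[tx["Price"] > 0]
```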